Which, When, and How: Hierarchical Clustering with Human-Machine Cooperation

Authors

  • Huanyang Zheng
  • Jie Wu
Abstract

Human–Machine Cooperations (HMCs) can balance the advantages and disadvantages of human computation (accurate but costly) and machine computation (cheap but inaccurate). This paper studies HMCs in agglomerative hierarchical clusterings, where the machine can ask the human some questions. The human will return the answers to the machine, and the machine will use these answers to correct errors in its current clustering results. We are interested in the machine’s strategy on handling the question operations, in terms of three problems: (1) Which question should the machine ask? (2) When should the machine ask the question (early or late)? (3) How does the machine adjust the clustering result, if the machine’s mistake is found by the human? Based on the insights of these problems, an efficient algorithm is proposed with five implementation variations. Experiments on image clusterings show that the proposed algorithm can improve the clustering accuracy with few question operations.
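
The three questions in the abstract can be illustrated with a toy sketch. This is not the algorithm proposed in the paper, whose details are not given here. It assumes single-linkage clustering on 1-D points, a hypothetical `oracle(i, j)` function standing in for the human, a question `budget`, and a threshold `ask_above` so that only late (large-distance) merges are questioned. When the human rejects a merge, the sketch records a cannot-link pair and tries the next-best merge.

```python
from itertools import combinations


def cluster_with_questions(points, k, oracle, budget, ask_above):
    """Toy single-linkage agglomerative clustering that may query a human.

    oracle(i, j) -> bool is a hypothetical stand-in for the human: it
    returns True if items i and j belong to the same cluster.
    Returns (sorted clusters, number of questions asked).
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    cannot_link = set()  # item pairs the human said must stay apart
    asked = 0

    def linkage(a, b):
        # Single linkage: the closest pair of members, as (distance, i, j).
        return min((abs(points[i] - points[j]), i, j) for i in a for j in b)

    def blocked(a, b):
        # A merge is blocked if a rejected item pair spans the two clusters.
        return any((i in a and j in b) or (i in b and j in a)
                   for i, j in cannot_link)

    while len(clusters) > k:
        candidates = [(linkage(a, b), a, b)
                      for a, b in combinations(clusters, 2)
                      if not blocked(a, b)]
        if not candidates:
            break  # every remaining merge was rejected by the human
        (d, i, j), a, b = min(candidates, key=lambda c: c[0])
        # "Which" and "when": question the closest pair of a late merge.
        if d > ask_above and asked < budget:
            asked += 1
            if not oracle(i, j):
                # "How": forbid this merge and fall back to the next best.
                cannot_link.add((i, j))
                continue
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]

    return sorted(sorted(c) for c in clusters), asked


# Made-up data: the true grouping is {0, 1} and {2, 3, 4}.
points = [0, 1, 3, 4, 10]
truth = [0, 0, 1, 1, 1]


def oracle(i, j):
    return truth[i] == truth[j]


alone = cluster_with_questions(points, 2, oracle, budget=0, ask_above=1)
with_help = cluster_with_questions(points, 2, oracle, budget=2, ask_above=1)
print(alone)
print(with_help)
```

On this made-up data, the machine alone merges the two nearby groups and isolates the outlier, while two questions are enough to recover the true grouping.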

Similar resources

Graph Clustering by Hierarchical Singular Value Decomposition with Selectable Range for Number of Clusters Members

Graphs have so many applications in real world problems. When we deal with huge volume of data, analyzing data is difficult or sometimes impossible. In big data problems, clustering data is a useful tool for data analysis. Singular value decomposition (SVD) is one of the best algorithms for clustering graph but we do not have any choice to select the number of clusters and the number of members ...

Toward Mixed-Initiative Email Clustering

Organizing data into hierarchies is natural for humans. However, there is little work in machine learning that explores human-machine mixed-initiative approaches to organizing data into hierarchical clusters. In this paper we consider mixed-initiative clustering of a user’s email, in which the machine produces (initial and retrained) hierarchical clusterings of email, and the user reviews and e...

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Human errors identification in operation of meat grinder using TAFEI technique

  Background: Human error is the most important cause of occupational and non-occupational accidents. Because, it seems necessary to identify, predict and analyze human errors, and also offer appropriate control strategies to reduce errors which cause adverse consequences, the present study was carried out with the aim of identifying human errors while operating meat grinder and offer sugg...

Hierarchical Fuzzy Clustering Semantics (HFCS) in Web Document for Discovering Latent Semantics

This paper discusses about the future of the World Wide Web development, called Semantic Web. Undoubtedly, Web service is one of the most important services on the Internet, which has had the greatest impact on the generalization of the Internet in human societies. Internet penetration has been an effective factor in growth of the volume of information on the Web. The massive growth of informat...

Journal:
  • Algorithms

Volume 9

Publication date: 2016